Description

[0001] The present invention relates to data communications. In particular, the present invention relates to creating
a unique audio signature.

[0002] Digital audio technology has greatly changed the landscape of music and entertainment. Rapid increases in
computing power coupled with decreases in cost have made it possible for individuals to generate finished products having
a quality once available only in a major studio. One consequence of modern technology is that legacy media storage
standards, such as reel-to-reel tapes, are being rapidly replaced by digital storage media, such as the Digital Versatile
Disk (DVD), and Digital Audio Tape (DAT). Additionally, with higher capacity hard drives standard on most personal
computers, home users may now store digital files such as audio or video tracks on their home computers.
[0003] Furthermore, the Internet has generated much excitement particularly among those who see the Internet as 
an opportunity to develop new avenues for artistic expression and communication. The Internet has become a virtual 
gallery, where artists may post their works on a Web page. Once posted, the works may be viewed by anyone having 
access to the Internet. 

[0004] One application of the Internet that has received considerable attention is the ability to transmit recorded music
over the Internet. Once music has been digitally encoded into a file, the file may be either downloaded by users for play
or broadcast ("streamed") over the Internet. When files are streamed, they may be listened to by Internet users in a
manner much like traditional radio stations.

[0005] Given the widespread use of digital media, digital audio files, or digital video files containing audio information,
may need to be identified. The need for identification of digital files may arise in a variety of situations. For example, an
artist may wish to verify royalty payments or generate their own Arbitron®-like ratings by identifying how often their works
are being streamed or downloaded. Additionally, users may wish to identify a particular work. The prior art has made
efforts to create methods for identifying digital audio works.

[0006] However, systems of the prior art suffer from certain disadvantages. For example, prior art systems typically
create a reference signature by examining the copyrighted work as a whole, and then creating a signature based upon
the audio characteristics of the entire work. However, examining a work in total can result in a signature that may not accurately
represent the original work. Often, a work may have distinctive passages which may not be reflected in a signature
based upon the total work. Furthermore, works are often electronically processed prior to being streamed or downloaded,
in a manner that may affect details of the work's audio characteristics, which may result in prior art systems missing the
identification of such works. Examples of such electronic processing include data compression and various sorts of
audio signal processing such as equalization.

[0007] US 5,918,223 discloses a method for determining a work in which the work is segmented, a signature is generated
and compared to a reference signature to determine if the work is known.

[0008] Hence, there exists a need to provide a system which overcomes the disadvantages of the prior art.
[0009] The present invention relates to data communications. In particular, the present invention relates to creating 
a unique audio signature. 

[0010] According to a first aspect of the invention there is provided a method for determining an identity of a sampled
work, said method comprising receiving data of a sampled work, segmenting said data of said sampled work into a
plurality of segments wherein each of said segments has predetermined segment size and a predetermined hop size,
creating a signature of said sampled work based upon said plurality of segments, comparing said signature of said sampled
work to a plurality of signatures of reference works, and determining said sampled work is one of said reference works
based upon said comparison, said method characterized in that said predetermined hop size of said segments of said
sampled work signature is chosen to be less than said hop size of each of said plurality of reference signatures.
[0011] According to a further aspect of the invention there is provided an apparatus that determines an identity of a 
sampled work, said apparatus comprising circuitry configured to receive data of a sampled work, circuitry configured to 
segment said data of said sampled work into a plurality of segments wherein each of said segments has predetermined 
segment size and a predetermined hop size, circuitry configured to create a signature of said sampled work based upon 
said plurality of segments, circuitry configured to compare said signature of said sampled work to a plurality of signatures 
of reference works, and circuitry configured to determine said sampled work is one of said reference works based upon
said comparison, said apparatus characterized in that said predetermined hop size of said segments of said sampled
work signature is chosen to be less than said hop size of each of said plurality of reference signatures. 
[0012] According to a further aspect of the invention there is provided a program storage device readable by a machine,
tangibly embodying a program of instructions executable by the machine to perform a method as described above.

Figure 1 is a flowchart of a method according to the present invention. 
Figure 2 is a diagram of a system suitable for use with the present invention.
Figure 3 is a diagram of segmenting according to the present invention.

Figure 4 is a detailed diagram of segmenting according to the present invention showing hop size.
Figure 5 is a graphical flowchart showing the creating of a segment feature vector according to the present invention.

Figure 6 is a diagram of a signature according to the present invention. 

Figure 7 is a functional diagram of a comparison process according to the present invention. 

[0013] Persons of ordinary skill in the art will realize that the following description of the present invention is illustrative
only and not in any way limiting. Other embodiments of the invention will readily suggest themselves to such skilled
persons having the benefit of this disclosure.

[0014] It is contemplated that the present invention may be embodied in various computer and machine-readable data
structures. Furthermore, it is contemplated that data structures embodying the present invention will be transmitted
across computer and machine-readable media, and through communications systems by use of standard protocols such 
as those used to enable the Internet and other computer networking standards. 

[0015] The invention further relates to machine-readable media on which are stored embodiments of the present
invention. It is contemplated that any media suitable for storing instructions related to the present invention is within the 
scope of the present invention. By way of example, such media may take the form of magnetic, optical, or semiconductor 
media. 

[0016] The present invention may be described through the use of flowcharts. Often, a single instance of an embodiment
of the present invention will be shown. As is appreciated by those of ordinary skill in the art, however, the protocols,
processes, and procedures described herein may be repeated continuously or as often as necessary to satisfy the needs
described herein. Accordingly, the representation of the present invention through the use of flowcharts should not be
used to limit the scope of the present invention. 

[0017] The present invention may also be described through the use of web pages in which embodiments of the 
present invention may be viewed and manipulated. It is contemplated that such web pages may be programmed with 
web page creation programs using languages standard in the art such as HTML or XML. It is also contemplated that 
the web pages described herein may be viewed and manipulated with web browsers running on operating systems 
standard in the art, such as the Microsoft Windows® and Macintosh® versions of Internet Explorer® and Netscape®.
Furthermore, it is contemplated that the functions performed by the various web pages described herein may be implemented
through the use of standard programming languages such as Java® or similar languages.
[0018] The present invention will first be described in general overview. Then, each element will be described in further
detail below. 

[0019] Referring now to Figure 1, a flowchart is shown which provides a general overview of the present invention.
The present invention may be viewed as four steps: 1) receiving a sampled work; 2) segmenting the work; 3) creating
signatures of the segments; and 4) storing the signatures of the segments.

Receiving a sampled work 

[0020] Beginning with act 100, a sampled work is provided to the present invention. It is contemplated that the work
will be provided to the present invention as a digital audio stream. 

[0021] It should be understood that if the audio is in analog form, it may be digitized in a manner standard in the art.

Segmenting the work

[0022] After the sampled work is received, the work is then segmented in act 102. It is contemplated that the sampled
work may be segmented into predetermined lengths. Though segments may be of any length, the segments of the
present invention are preferably of the same length.

[0023] In an exemplary non-limiting embodiment of the present invention, the segment lengths are in the range of 0.5 
to 3 seconds. It is contemplated that if one were searching for very short sounds (e.g., sound effects such as gunshots), 
segments as small as 0.01 seconds may be used in the present invention. Since humans don't resolve audio changes
below about 0.018 seconds, segment lengths less than 0.018 seconds may not be useful. On the other hand, segment
lengths as high as 30-60 seconds may be used in the present invention. The inventors have found that beyond 30-60
seconds may not be useful, since most details in the signal tend to average out.

Generating signatures 

[0024] Next, in act 104, each segment is analyzed to produce a signature, known herein as a segment feature vector.
It is contemplated that a wide variety of methods known in the art may be used to analyze the segments and generate
segment feature vectors. In an exemplary non-limiting embodiment of the present invention, the segment feature vectors
may be created using the method described in US Patent #5,918,223 to Blum.

Storing the signatures

[0025] In act 106, the segment feature vectors are stored to create a representative signature of the sampled work.
[0026] Each above-listed step will now be shown and described in detail.

[0027] Referring now to Figure 2, a diagram of a system suitable for use with the present invention is shown. FIG. 2
includes a client system 200. It is contemplated that client system 200 may comprise a personal computer 202 including
hardware and software standard in the art to run an operating system such as Microsoft Windows®, MAC OS®, or other
operating systems standard in the art. Client system 200 may further include a database 204 for storing and retrieving
embodiments of the present invention. It is contemplated that database 204 may comprise hardware and software
standard in the art and may be operatively coupled to PC 202. Database 204 may also be used to store and retrieve
the works and segments utilized by the present invention.

[0028] Client system 200 may further include an audio/video (A/V) input device 208. A/V device 208 is operatively
coupled to PC 202 and is configured to provide works to the present invention which may be stored in traditional audio
or video formats. It is contemplated that A/V device 208 may comprise hardware and software standard in the art
configured to receive and sample audio works (including video containing audio information), and provide the sampled
works to the present invention as digital audio files. Typically, the A/V input device 208 would supply raw audio samples
in a format such as 16-bit stereo PCM format. A/V input device 208 provides an example of means for receiving a
sampled work.

[0029] It is contemplated that sampled works may also be obtained over the Internet. Typically, streaming media over
the Internet is provided by a provider, such as provider 218 of FIG. 2. Provider 218 includes a streaming application
server 220, configured to retrieve works from database 222 and stream the works in formats standard in the art, such
as Real®, Windows Media®, or QuickTime®. The server then provides the streamed works to a web server 224, which
then provides the streamed work to the Internet 214 through a gateway 216. Internet 214 may be any packet-based
network standard in the art, such as IP, Frame Relay, or ATM.
[0030] To reach the provider 218, the present invention may utilize a cable or DSL head end 212 standard in the art,
which is operatively coupled to a cable modem or DSL modem 210, which is in turn coupled to the system's network
206. The network 206 may be any network standard in the art, such as a LAN provided by a PC 202 configured to run
software standard in the art.

[0031] It is contemplated that the sampled work received by system 200 may contain audio information from a variety
of sources known in the art, including, without limitation, radio, the audio portion of a television broadcast, Internet radio,
the audio portion of an Internet video program or channel, streaming audio from a network audio server, audio delivered
to personal digital assistants over cellular or wireless communication systems, or cable and satellite broadcasts.
[0032] Additionally, it is contemplated that the present invention may be configured to receive and compare segments
coming from a variety of sources either stored or in real-time. For example, it is contemplated that the present invention
may compare a real-time streaming work coming from streaming server 218 or A/V device 208 with a reference segment
stored in database 204.

[0033] Figure 3 shows a diagram showing the segmenting of a work according to the present invention. FIG. 3 includes
audio information 300 displayed along a time axis 302. FIG. 3 further includes a plurality of segments 304, 306, and 308
taken of audio information 300 over some segment size T.

[0034] In an exemplary non-limiting embodiment of the present invention, instantaneous values of a variety of acoustic
features are computed at a low level, preferably about 100 times a second. Additionally, 10 MFCCs (cepstral coefficients)
are computed for each segment. It is contemplated that any number of MFCCs may be computed. Preferably, 5-20
MFCCs are computed; however, as many as 30 MFCCs may be computed, depending on the need for accuracy versus
speed.

[0035] In an exemplary non-limiting embodiment of the present invention, the segment-level acoustical features comprise
statistical measures, as disclosed in the '223 patent, of these low-level features calculated over the length of each
segment. The data structure may store other bookkeeping information as well (segment size, hop size, item ID, UPC, etc.).
[0036] As can be seen by inspection of FIG. 3, the segments 304, 306, and 308 may overlap in time. This amount of
overlap may be represented by measuring the time between the center point of adjacent segments. This amount of time
is referred to herein as the hop size of the segments, and is so designated in FIG. 3. By way of example, if the segment
length T of a given segment is one second, and adjacent segments overlap by 50%, the hop size would be 0.5 second.
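
The relationship between segment length, overlap, and hop size described above can be sketched in C. This is a minimal illustration only, not code from the disclosure: the function names are hypothetical, lengths are assumed to be expressed in samples, and all segments are assumed to have the same length, in which case the time between segment centers equals the time between segment starts.

```c
#include <stddef.h>

/* Hypothetical helper: number of full segments of length seg_len that fit in
   total_len samples when consecutive segments start hop samples apart. */
size_t count_segments(size_t total_len, size_t seg_len, size_t hop)
{
    if (seg_len == 0 || hop == 0 || total_len < seg_len)
        return 0;
    return (total_len - seg_len) / hop + 1;
}

/* Hypothetical helper: start offset, in samples, of segment number index. */
size_t segment_start(size_t index, size_t hop)
{
    return index * hop;
}

/* Hop size implied by a segment length and a fractional overlap in [0, 1).
   A one-second segment with 50% overlap gives a hop size of 0.5 second. */
double hop_from_overlap(double seg_seconds, double overlap_fraction)
{
    return seg_seconds * (1.0 - overlap_fraction);
}
```

For instance, three seconds of audio at an assumed rate of 44,100 samples per second, cut into one-second segments with a half-second hop, yields five segments under this sketch.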
[0037] The hop size may be set during the development of the software. Additionally, the hop sizes of the reference
database and the real-time segments may be predetermined to facilitate compatibility. For example, the reference
signatures in the reference database may be precomputed with a fixed hop and segment size, and thus the client
applications should conform to this segment size and have a hop size which integrally divides the reference signature
hop size. It is contemplated that one may experiment with a variety of segment sizes in order to balance the tradeoff of
accuracy with speed of computation for a given application.
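
The compatibility condition stated above (a client hop size that integrally divides the reference hop size) could be checked as in the following sketch. The function name is hypothetical, and hop sizes are assumed to be given in whole milliseconds so that the divisibility test avoids floating-point rounding.

```c
#include <stdbool.h>

/* Hypothetical check: returns true when the client (real-time) hop size
   evenly divides the reference hop size. Both values are in milliseconds. */
bool hop_is_compatible(unsigned reference_hop_ms, unsigned client_hop_ms)
{
    if (client_hop_ms == 0 || client_hop_ms > reference_hop_ms)
        return false;
    return reference_hop_ms % client_hop_ms == 0;
}
```

Under these assumptions a 100 ms client hop is compatible with a 500 ms reference hop, while a 300 ms client hop is not.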

[0038] The inventors have found that by carefully choosing the hop size of the segments, the accuracy of the identification
process may be significantly increased. Additionally, the inventors have found that the accuracy of the identification
process may be increased if the hop size of reference segments and the hop size of segments obtained in real-time are
each chosen independently. The importance of the hop size of segments may be illustrated by examining the process
for segmenting pre-recorded works and real-time works separately. 

Reference signatures 

[0039] Prior to attempting to identify a given work, a reference database of signatures must be created. When building
a reference database, a segment length having a period of less than three seconds is preferred. In an exemplary non- 
limiting embodiment of the present invention, the segment lengths have a period ranging from 0.5 seconds to 3 seconds. 
For a reference database, the inventors have found that a hop size of approximately 50% to 100% of the segment size 
is preferred. 

[0040] It is contemplated that the reference signatures may be stored on a database such as database 204 as described 
above. Database 204 and the discussion herein provide an example of means for providing a plurality of reference
signatures each having a segment size and a hop size. 

Real-time signatures 

[0041] The choice of the hop size is important for real-time segments. 

[0042] Figure 4 shows a detailed diagram of a real-time segment according to the present invention. FIG. 4 includes
real-time audio information 400 displayed along a time axis 402. FIG. 4 further includes segments 404 and 406 taken
of audio information 400 over some segment length T. In an exemplary non-limiting embodiment of the present invention,
the segment length of real-time segments is chosen to range from 0.5 to 3 seconds. 

[0043] As can be seen by inspection of FIG. 4, the hop size of real-time segments is chosen to be smaller than that of reference
segments. In an exemplary non-limiting embodiment of the present invention, the hop size of real-time segments is less
than 50% of the segment size. In yet another exemplary non-limiting embodiment of the present invention, the real-time
hop size may be 0.1 seconds. 

[0044] The inventors have found such a small hop size advantageous for the following reasons. The ultimate purpose
of generating real-time segments is to analyze and compare them with the reference segments in the database to look
for matches. The inventors have found at least two major reasons why a segment of the same audio recording captured
real-time would not match its counterpart in the database. One is that the broadcast channel does not produce a perfect
copy of the original. For example, the work may be edited or processed or the announcer may talk over part of the work.
The other reason is that larger segment boundaries may not line up in time with the original segment boundaries of the
target recordings. 

[0045] The inventors have found that by choosing a smaller hop size, some of the segments will ultimately have time
boundaries that line up with the original segments, notwithstanding the problems listed above. The segments that line
up with a "clean" segment of the work may then be used to make an accurate comparison while those that do not so
line up may be ignored. The inventors have found that a hop size of 0.1 seconds seems to be the maximum that would
solve this time shifting problem.
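
The time-shifting argument can be made concrete with a small sketch. It assumes, for illustration only, that real-time segments start at regular multiples of the hop size from a first start time, that the stream continues indefinitely, and that all times are in whole milliseconds; the function name is hypothetical. Under these assumptions the nearest real-time segment start is never more than half a hop away from any given reference segment start, so a smaller hop bounds the misalignment more tightly.

```c
/* Hypothetical helper: distance in milliseconds from a reference segment start
   to the nearest real-time segment start, where real-time segments begin at
   first_start_ms + k * hop_ms for k = 0, 1, 2, ...
   Returns -1 if hop_ms is not positive. */
long nearest_misalignment_ms(long reference_start_ms, long first_start_ms, long hop_ms)
{
    long delta, rem;

    if (hop_ms <= 0)
        return -1;
    delta = reference_start_ms - first_start_ms;
    if (delta <= 0)
        return -delta;              /* nearest real-time segment is the first one */
    rem = delta % hop_ms;           /* offset past the previous real-time start */
    return rem < hop_ms - rem ? rem : hop_ms - rem;
}
```

With a reference segment starting 1230 ms into the stream, a 100 ms real-time hop leaves a 30 ms misalignment in this sketch, whereas a 500 ms hop leaves 230 ms.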

[0046] As mentioned above, once a work has been segmented, the individual segments are then analyzed to produce
a segment feature vector. Figure 5 is a diagram showing an overview of how the segment feature vectors may be created
using the methods described in US Patent #5,918,223 to Blum, et al. It is contemplated that a variety of analysis methods
may be useful in the present invention, and many different features may be used to make up the feature vector. The
inventors have found the pitch, brightness, bandwidth, and loudness features of the '223 patent to be useful in the
present invention. Additionally, spectral features may be analyzed, such as the energy in various spectral bands.
The inventors have found that the cepstral features (MFCCs) are very robust (more invariant) given the distortions
typically introduced during broadcast, such as EQ, multi-band compression/limiting, and audio data compression techniques
such as MP3 encoding/decoding, etc.

[0047] In act 500, the audio segment is sampled to produce a segment. In act 502, the sampled segment is then
analyzed using Fourier Transform techniques to transform the signal into the frequency domain. In act 504, mel frequency
filters are applied to the transformed signal to extract the significant audible characteristics of the spectrum. In act 506,
a Discrete Cosine Transform is applied which converts the signal into mel frequency cepstral coefficients (MFCCs).
Finally, in act 508, the MFCCs are then averaged over a predetermined period. In an exemplary non-limiting embodiment
of the present invention, this period is approximately one second. Additionally, other characteristics may be computed
at this time, such as brightness or loudness. A segment feature vector is then produced which contains a list containing
at least the corresponding averages of the 10 MFCCs.
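
The averaging of act 508 might look like the following sketch. It covers only the final averaging step and assumes the per-frame MFCCs have already been computed by acts 500 to 506; the function name, the row-major frame layout, and the fixed count of 10 coefficients are assumptions made for illustration.

```c
#include <stddef.h>

#define NUM_MFCC 10   /* assumed number of coefficients per frame */

/* Hypothetical helper: averages each MFCC over all frames of one segment.
   frames holds num_frames rows of NUM_MFCC coefficients in row-major order.
   out receives the NUM_MFCC averages forming the segment feature vector.
   Returns 0 on success, -1 if there is nothing to average. */
int average_mfcc(const float *frames, size_t num_frames, float out[NUM_MFCC])
{
    size_t f, c;

    if (frames == NULL || num_frames == 0)
        return -1;
    for (c = 0; c < NUM_MFCC; ++c) {
        double sum = 0.0;           /* accumulate in double to limit rounding */
        for (f = 0; f < num_frames; ++f)
            sum += frames[f * NUM_MFCC + c];
        out[c] = (float)(sum / (double)num_frames);
    }
    return 0;
}
```

At roughly 100 low-level frames per second and an averaging period of about one second, each call in this sketch would reduce about 100 frames to a single 10-element vector.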

[0048] The disclosure of FIGS. 3, 4, and 5 provides examples of means for creating a signature of a sampled work
having a segment size and a hop size.
[0049] Figure 6 is a diagram showing a complete signature 600 according to the present invention. Signature 600
includes a plurality of segment feature vectors 1 through n generated as shown and described above. Signature 600
may also include an identification portion containing a unique ID. It is contemplated that the identification portion may
contain a unique identifier provided by the RIAA (Recording Industry Association of America). The identification portion
may also contain information such as the UPC (Universal Product Code) of the various products that contain the audio
corresponding to this signature. Additionally, it is contemplated that the signature 600 may also contain information
pertaining to the characteristics of the file itself, such as the hop size, segment size, number of segments, etc., which
may be useful for storing and indexing.

[0050] Signature 600 may then be stored in a database and used for comparisons. The following computer code in
the C programming language provides an example of a database structure in memory according to the present invention:

typedef struct
{
    float hopSize;              /* hop size */
    float segmentSize;          /* segment size */
    MFSignature* signatures;    /* array of signatures; MFSignature is defined in the listing that follows */
} MFDatabase;                   /* closing line is missing from the source text; the name MFDatabase is assumed */

[0051] The following provides an example of the structure of a segment according to the present invention:

typedef struct
{
    char* id;            /* unique ID for this audio clip */
    long numSegments;    /* number of segments */
    float* features;     /* feature array */
    long size;           /* size of per-segment feature vector */
    float hopSize;
    float segmentSize;
} MFSignature;

[0052] The discussion of FIG. 6 provides an example of means for storing segments and signatures according to the
present invention. 
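
One way the per-segment feature vectors could be read back from such a signature is sketched below. The disclosure does not state how the feature array is laid out, so this sketch assumes the vectors are stored one after another in a single flat array; the type name ExampleSignature and the function name are hypothetical, with fields that mirror the signature listing above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the signature structure, redefined here so the
   sketch is self-contained. */
typedef struct {
    const char *id;       /* unique ID for this audio clip */
    long numSegments;     /* number of segments */
    float *features;      /* numSegments * size floats, one vector after another (assumed layout) */
    long size;            /* size of per-segment feature vector */
    float hopSize;
    float segmentSize;
} ExampleSignature;

/* Returns a pointer to the feature vector of segment number index,
   or NULL if the signature is empty or the index is out of range. */
const float *segment_features(const ExampleSignature *sig, long index)
{
    if (sig == NULL || sig->features == NULL)
        return NULL;
    if (index < 0 || index >= sig->numSegments || sig->size <= 0)
        return NULL;
    return sig->features + (size_t)index * (size_t)sig->size;
}
```

Keeping the vectors contiguous lets a comparison routine walk the whole signature with simple pointer arithmetic, which is one plausible reason for storing a single feature array together with a per-segment size.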

[0053] Figure 7 shows a functional diagram of a comparison process according to the present invention. Act 1 of FIG.
7 shows unknown audio being converted to a signature according to the present invention. In act 2, reference signatures
are retrieved from a reference database. Finally, the reference signatures are scanned and compared to the unknown
audio signatures to determine whether a match exists. This comparison may be accomplished through means known
in the art. For example, the Euclidean distance between the reference and real-time signature can be computed and
compared to a threshold.
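
The Euclidean-distance comparison mentioned above can be sketched as follows. The function names and the threshold parameter are hypothetical, and the threshold is assumed to be non-negative. The sketch compares squared distances, which gives the same decision as comparing the distance itself to the threshold while avoiding a square root.

```c
#include <stddef.h>

/* Hypothetical helper: squared Euclidean distance between two feature
   vectors of n elements each. */
double squared_distance(const float *a, const float *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; ++i) {
        double d = (double)a[i] - (double)b[i];
        sum += d * d;
    }
    return sum;
}

/* Returns 1 when the Euclidean distance between the two vectors is at most
   threshold (assumed non-negative), 0 otherwise. */
int is_match(const float *a, const float *b, size_t n, double threshold)
{
    return squared_distance(a, b, n) <= threshold * threshold;
}
```

A suitable threshold is not given in the text and would have to be chosen empirically for the features in use.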

[0054] It is contemplated that the present invention has many beneficial uses, including many outside of the music
piracy area. For example, the present invention may be used to verify royalty payments. The verification may take place
at the source or the listener. Also, the present invention may be utilized for the auditing of advertisements, or collecting
Arbitron®-like data (who is listening to what). The present invention may also be used to label the audio recordings on
a user's hard disk or on the web.

[0055] While embodiments and applications of this invention have been shown and described, it would be apparent
to those skilled in the art that many more modifications than mentioned above are possible within the scope defined by
the appended claims.



Claims 

1. A method for determining an identity of a sampled work, said method comprising receiving data of a sampled work,
segmenting said data of said sampled work into a plurality of segments wherein each of said segments has predetermined
segment size and a predetermined hop size, creating a signature of said sampled work based upon said
plurality of segments, comparing said signature of said sampled work to a plurality of signatures of reference works,
and determining said sampled work is one of said reference works based upon said comparison, said method
characterized in that said predetermined hop size of said segments of said sampled work signature is chosen to
be less than said hop size of each of said plurality of reference signatures.

2. The method of claim 1, wherein said act of creating a signature of said sampled work comprises calculating segment
feature vectors for each segment of said sampled work.

3. The method of claim 1, wherein said act of creating a signature of said sampled work includes calculating a plurality
of MFCCs for each said segment.

4. The method of claim 1, wherein said act of creating a signature of said sampled work includes calculating a plurality
of acoustical features from the group consisting of at least one of loudness, pitch, brightness, bandwidth, spectrum
and MFCC coefficients for each said segment.

5. The method of claim 1, wherein said sampled work signature comprises a plurality of segments and an identification
portion.

6. The method of claim 1, wherein said plurality of segments of said sampled work signature comprise a segment size
of approximately 0.5 to 3 seconds.

7. The method of claim 6, wherein said plurality of segments of said sampled work signature comprise a hop size of
less than 50% of the segment size.

8. The method of claim 6, wherein said plurality of segments of said sampled work signature comprise a hop size of
approximately 0.1 seconds.

9. An apparatus that determines an identity of a sampled work, said apparatus comprising circuitry configured to receive
data of a sampled work, circuitry configured to segment said data of said sampled work into a plurality of segments
wherein each of said segments has predetermined segment size and a predetermined hop size, circuitry configured
to create a signature of said sampled work based upon said plurality of segments, circuitry configured to compare
said signature of said sampled work to a plurality of signatures of reference works, and circuitry configured to
determine said sampled work is one of said reference works based upon said comparison, said apparatus characterized
in that said predetermined hop size of said segments of said sampled work signature is chosen to be less
than said hop size of each of said plurality of reference signatures.

10. The apparatus of claim 9, wherein said circuitry configured to create a signature of said sampled work comprises
circuitry configured to calculate segment feature vectors for each of said plurality of segments of said sampled work.

11. The apparatus of claim 9, wherein said circuitry configured to create a signature includes calculating a plurality of
MFCCs for each said segment.

12. The apparatus of claim 9, wherein said circuitry configured to create a signature includes circuitry configured to
calculate one of a plurality of acoustical features selected from a group consisting of loudness, pitch, brightness,
bandwidth, spectrum and MFCC coefficients for each of said plurality of segments of said sampled works.

13. The apparatus of claim 9, wherein said sampled work signature comprises a plurality of segments and an identification
portion.

14. The apparatus of claim 9, wherein said plurality of segments of said sampled work comprise said predetermined
segment size of approximately 0.5 to 3 seconds.

15. The apparatus of claim 14, wherein said predetermined hop size of said plurality of segments of said sampled work
signature is less than 50% of the segment size.

16. The apparatus of claim 14, wherein said predetermined hop size of each of said plurality of segments of said sampled
work signature is approximately 0.1 seconds.

17. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the 
machine to perform a method according to any one of claims 1 to 8. 
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Patentanqpruche 

1. Verfahren zum Emifttdn einer Idenlitat eines abgetastelen Werks, wobei das Verfahren umfasst: das Empfangen 
von Daten eines abgetasteten Werks, das Segmentieren der Daten des abgetasteten Werks irt mehrere Segmente, 

s wobei jedes der Segmente eine vorbestlmmte Segmentgr6l3e und eine vortoestimmte Hop-Size aufwelst, das Er- 

zeugen einer Signatur des abget^teten Werks beruhend auf den mehreren Segmenten, das Vergletehen der 
Signatur des abgetasteten Weri<s mil mehreren Signaturen von Referenzwerlten und beruhend auf dem Vergleich 
das Bestimmen, dass das abgetastete Werk eines der Referenzwerke ist, wobei das Verfahren dadurch gekenn- 
zeichnet ist, dass die vorbestimmte Hop-Size der Segmente der Signatur des abgetasteten Werks kleiner als die 

10 Hop-Size jeder der mehreren Referenzstgnaturen gewahlt wird. 

2. The method according to claim 1, characterized in that the act of creating a signature of the sampled work
comprises calculating segment feature vectors for each segment of the sampled work.

3. The method according to claim 1, characterized in that the act of creating a signature of the sampled work
comprises calculating a plurality of MFCCs for each segment.

4. The method according to claim 1, characterized in that the act of creating a signature of the sampled work
comprises calculating a plurality of acoustic features from the group consisting of at least one of loudness,
pitch, brightness, bandwidth, spectrum and MFCC coefficients for each segment.

5. The method according to claim 1, characterized in that the signature of the sampled work comprises a plurality
of segments and an identification portion.

6. The method according to claim 1, characterized in that the plurality of segments of the sampled work signature
comprise a segment size of approximately 0.5 to 3 seconds.

7. The method according to claim 6, characterized in that the plurality of segments of the sampled work signature
comprise a hop size of less than 50% of the segment size.

8. The method according to claim 6, characterized in that the plurality of segments of the sampled work signature
comprise a hop size of approximately 0.1 seconds.

9. An apparatus that determines an identity of a sampled work, the apparatus comprising: circuitry configured to
receive data of a sampled work; circuitry configured to segment the data of the sampled work into a plurality of
segments, each of the segments having a predetermined segment size and a predetermined hop size; circuitry
configured to create a signature of the sampled work based on the plurality of segments; circuitry configured to
compare the signature of the sampled work with a plurality of signatures of reference works; and circuitry
configured to determine, based on the comparison, that the sampled work is one of the reference works, the
apparatus being characterized in that the predetermined hop size of the segments of the sampled work signature is
chosen to be smaller than the hop size of each of the plurality of reference signatures.

10. The apparatus according to claim 9, characterized in that the circuitry configured to create a signature of the
sampled work comprises circuitry configured to calculate segment feature vectors for each of the plurality of
segments of the sampled work.

11. The apparatus according to claim 9, characterized in that the circuitry configured to create a signature
comprises calculating a plurality of MFCCs for each segment.

12. The apparatus according to claim 9, characterized in that the circuitry configured to create a signature
comprises circuitry configured to calculate one of a plurality of acoustic features selected from a group
consisting of loudness, pitch, brightness, bandwidth, spectrum and MFCC coefficients for each of the plurality of
segments of the sampled works.

13. The apparatus according to claim 9, characterized in that the signature of the sampled work comprises a
plurality of segments and an identification portion.

14. The apparatus according to claim 9, characterized in that the plurality of segments of the sampled work comprise
the predetermined segment size of approximately 0.5 to 3 seconds.

15. The apparatus according to claim 14, characterized in that the predetermined hop size of the plurality of
segments of the sampled work signature is less than 50% of the segment size.

16. The apparatus according to claim 14, characterized in that the predetermined hop size of each of the plurality
of segments of the sampled work signature is approximately 0.1 seconds.

17. A machine-readable program storage device tangibly embodying a program of instructions executable by the
machine to perform a method according to any one of claims 1 to 8.
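
By way of illustration only, the creation of a signature from per-segment feature vectors might be sketched as follows. The names `rms_loudness`, `zero_crossing_rate`, and `make_signature` are hypothetical. RMS energy is used here as a simple stand-in for loudness; the zero-crossing rate is a crude proxy for pitch or brightness and is not itself one of the features listed in the claims. The dictionary layout, with an `id` field next to the segment vectors, is merely one assumed way of holding a plurality of segments together with an identification portion.

```python
import math

def rms_loudness(samples):
    """Root-mean-square amplitude, used here as a stand-in for loudness."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs (assumed proxy)."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

def make_signature(samples, sample_rate, segment_s, hop_s, work_id=None):
    """Build a signature: one feature vector per overlapping segment."""
    seg = int(round(segment_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    vectors = []
    for start in range(0, len(samples) - seg + 1, hop):
        window = samples[start:start + seg]
        vectors.append((rms_loudness(window), zero_crossing_rate(window)))
    # "id" plays the role of an identification portion in this sketch.
    return {"id": work_id, "segment_s": segment_s, "hop_s": hop_s, "vectors": vectors}

# Example: a 2 second, 440 Hz test tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2 * sample_rate)]
sig = make_signature(tone, sample_rate, segment_s=1.0, hop_s=0.1, work_id="sampled")
```

A fuller implementation would replace the two toy features with MFCCs or the other acoustic features named in the claims; the overall shape of the signature would stay the same.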



Claims (from the French text)

1. A method for determining an identity of a sampled work, said method comprising receiving data of a sampled work,
segmenting said data of said sampled work into a plurality of segments wherein each of said segments has a
predetermined segment size and a predetermined hop size, creating a signature of said sampled work based on said
plurality of segments, comparing said signature of said sampled work to a plurality of signatures of reference
works, and determining that said sampled work is one of said reference works based on said comparison, said method
being characterized in that said predetermined hop size of said segments of said sampled work signature is chosen
to be less than said hop size of each of said plurality of reference signatures.

2. The method according to claim 1, wherein said act of creating a signature of said sampled work comprises
calculating segment feature vectors for each segment of said sampled work.

3. The method according to claim 1, wherein said act of creating a signature of said sampled work includes
calculating a plurality of MFCCs for each of said segments.

4. The method according to claim 1, wherein said act of creating a signature of said sampled work includes
calculating a plurality of acoustic features from the group consisting of at least one of loudness, pitch,
brightness, bandwidth, spectrum and MFCC coefficients for each segment.

5. The method according to claim 1, wherein said sampled work signature comprises a plurality of segments and an
identification portion.

6. The method according to claim 1, wherein said plurality of segments of said sampled work signature comprise a
segment size of approximately 0.5 to 3 seconds.

7. The method according to claim 6, wherein said plurality of segments of said sampled work signature comprise a
hop size of less than 50% of the segment size.

8. The method according to claim 6, wherein said plurality of segments of said sampled work signature comprise a
hop size of approximately 0.1 second.

9. An apparatus that determines an identity of a sampled work, said apparatus comprising a circuit configured to
receive data of a sampled work, a circuit configured to segment said data of said sampled work into a plurality of
segments wherein each of said segments has a predetermined segment size and a predetermined hop size, a circuit
configured to create a signature of said sampled work based on said plurality of segments, a circuit configured to
compare said signature of said sampled work to a plurality of reference work signatures, and a circuit configured
to determine that said sampled work is one of said reference works based on said comparison, said apparatus being
characterized in that said predetermined hop size of said segments of said sampled work signature is chosen to be
less than said hop size of each of said plurality of reference signatures.

10. The apparatus according to claim 9, wherein said circuit configured to create a signature of said sampled work
comprises a circuit configured to calculate segment feature vectors for each of said plurality of segments of said
sampled work.

11. The apparatus according to claim 9, wherein said circuit configured to create a signature includes calculating
a plurality of MFCCs for each segment.

12. The apparatus according to claim 9, wherein said circuit configured to create a signature includes a circuit
configured to calculate one of a plurality of acoustic features selected from the group consisting of loudness,
pitch, brightness, bandwidth, spectrum and MFCC coefficients for each of said plurality of segments of said
sampled works.

13. The apparatus according to claim 9, wherein said sampled work signature comprises a plurality of segments and
an identification portion.

14. The apparatus according to claim 9, wherein said plurality of segments of said sampled work comprise said
predetermined segment size of approximately 0.5 to 3 seconds.

15. The apparatus according to claim 14, wherein said predetermined hop size of said plurality of segments of said
sampled work signature is less than 50% of the segment size.

16. The apparatus according to claim 14, wherein said predetermined hop size of each of said plurality of segments
of said sampled work signature is approximately 0.1 second.

17. A machine-readable program storage device tangibly embodying a program of instructions executable by the
machine to perform a method according to any one of claims 1 to 8.
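
By way of illustration only, the comparison step and the characterizing feature (a sampled hop size smaller than the reference hop size) might be sketched as follows. Everything here is an assumption made for the example: the function names `distance` and `best_match`, the use of Euclidean distance averaged over a short run of segments, the requirement that the reference hop be an integer multiple of the sampled hop, and the two synthetic "works" whose feature vectors are smooth functions of time. The claims do not prescribe any particular distance measure or search strategy.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(sampled, sampled_hop, references, run=3):
    """Return (mean distance, name) of the closest reference work.

    sampled: feature vectors taken every sampled_hop seconds.
    references: dict mapping name -> (feature vectors, reference hop).
    """
    best = (float("inf"), None)
    for name, (ref_vecs, ref_hop) in references.items():
        # Number of sampled hops per reference hop (assumed to be an integer).
        step = round(ref_hop / sampled_hop)
        if step < 1:
            raise ValueError("sampled hop must not exceed the reference hop")
        for j in range(len(ref_vecs) - run + 1):
            for k in range(len(sampled) - (run - 1) * step):
                d = sum(distance(sampled[k + i * step], ref_vecs[j + i])
                        for i in range(run)) / run
                if d < best[0]:
                    best = (d, name)
    return best

# Toy "works": a feature vector as a smooth function of time (assumed data).
def work_a(t):
    return (math.sin(t), math.cos(2 * t))

def work_b(t):
    return (3 + math.sin(3 * t), -3 + math.cos(t))

ref_hop, sampled_hop = 0.5, 0.1
references = {
    "A": ([work_a(i * ref_hop) for i in range(20)], ref_hop),
    "B": ([work_b(i * ref_hop) for i in range(20)], ref_hop),
}
# An excerpt of work A that starts at an arbitrary offset of 1.23 s.
sampled = [work_a(1.23 + k * sampled_hop) for k in range(20)]
score, name = best_match(sampled, sampled_hop, references)
```

In this toy setup the excerpt does not start on a reference segment boundary. Because the sampled segments are spaced 0.1 s apart, one of them begins within 0.05 s (half the sampled hop) of any reference segment start that the excerpt covers, so a close alignment can be found without knowing where in the work the excerpt was taken.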